49 research outputs found

    Transferring Procedural Knowledge across Commonsense Tasks

    Stories about everyday situations are an essential part of human communication, motivating the need to develop AI agents that can reliably understand these stories. Despite the long list of supervised methods for story completion and procedural understanding, current AI has no mechanisms to automatically track and explain procedures in unseen stories. To bridge this gap, we study the ability of AI models to transfer procedural knowledge to novel narrative tasks in a transparent manner. We design LEAP: a comprehensive framework that integrates state-of-the-art modeling architectures, training regimes, and augmentation strategies based on both natural and synthetic stories. To address the lack of densely annotated training data, we devise a robust automatic labeler based on few-shot prompting to enhance the augmented data. Our experiments with in- and out-of-domain tasks reveal insights into the interplay of different architectures, training regimes, and augmentation strategies. LEAP's labeler has a clear positive impact on out-of-domain datasets, while the resulting dense annotation provides native explainability.

    ReferenceNet: a semantic-pragmatic network for capturing reference relations

    In this paper, we present ReferenceNet: a semantic-pragmatic network of reference relations between synsets. Synonyms are assumed to be exchangeable in similar contexts, and word embeddings are likewise based on the sharing of local contexts represented as vectors. Co-referring words, however, tend to occur in the same topical context but in different local contexts. In addition, they may express different concepts related through topical coherence, and through author framing and perspective. We describe how reference relations can be added to WordNet and how they can be acquired. We evaluate two methods of extracting event coreference relations using WordNet relations against a manual annotation of 38 documents within the same topical domain of gun violence. We conclude that precision is reasonable but recall is lower because the WordNet hierarchy does not sufficiently capture the required coherence and perspective relations.

    Magnetic nanostructures patterned by block copolymer lithography

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Materials Science and Engineering, 2008. "June 2008." Includes bibliographical references. The aim of this research was twofold: understanding the methods of patterning magnetic films using self-assembled block copolymer masks and examining the magnetic reversal mechanisms of as-deposited and patterned magnetic films. Ti/Co (66 at. %)-Cr (22 at. %)-Pt (12 at. %) (CoCrPt) films with perpendicular magnetic anisotropy were deposited on silicon wafers by UHV sputtering. Ti was used as an adhesion layer and texture promoter so that the easy magnetic axis of Co is aligned perpendicular to the sample plane. Magnetic reversal of Ti/CoCrPt films and Ti/CoCrPt/Ti/CoCrPt pseudo spin valve films is a domain nucleation and growth process with a slow time-dependent magnetization reversal, which was attributed to growth of reverse domains. The films were patterned into nanosized islands by block copolymer lithography using self-assembled polystyrene-polyferrocenyldimethylsilane (PS-PFS) as a mask. The islands reverse their magnetization in a coherent and independent fashion (Stoner-Wohlfarth reversal), in contrast to the continuous film. Micromagnetic simulation confirmed the coherent reversal of the thicker islands. Two graphoepitaxy methods were examined for inducing long range order (LRO) in block copolymers. Nanoimprint lithography with in-situ annealing was successful in guiding the self-assembly of the block copolymers in the grooves; however, no LRO was achieved. Selectively removable polymeric templates fabricated out of BARL-i® anti-reflection coating guide the self-assembly of PFS domains with good LRO and very few defects over a large area. The ordered arrays were then transferred into silica and W, forming an ordered array of cp-packed W islands with a period of 29 nm and an island diameter of 17 nm. Transfer of the pattern into CoCrPt is difficult due to the nonselective ion beam etching process. By Filip Ilievski. Ph.D.

    Identifying and Consolidating Knowledge Engineering Requirements

    Knowledge engineering is the process of creating and maintaining knowledge-producing systems. Throughout the history of computer science and AI, knowledge engineering workflows have been widely used because high-quality knowledge is assumed to be crucial for reliable intelligent agents. However, the landscape of knowledge engineering has changed, presenting four challenges: unaddressed stakeholder requirements, mismatched technologies, adoption barriers for new organizations, and misalignment with software engineering practices. In this paper, we propose to address these challenges by developing a reference architecture using a mainstream software methodology. By studying the requirements of different stakeholders and eras, we identify 23 essential quality attributes for evaluating reference architectures. We assess three candidate architectures from recent literature based on these attributes. Finally, we discuss the next steps towards a comprehensive reference architecture, including prioritizing quality attributes, integrating components with complementary strengths, and supporting missing socio-technical requirements. As this endeavor requires a collaborative effort, we invite all knowledge engineering researchers and practitioners to join us.

    Using Visual Cropping to Enhance Fine-Detail Question Answering of BLIP-Family Models

    Visual Question Answering is a challenging task, as it requires seamless interaction between perceptual, linguistic, and background knowledge systems. While the recent progress of visual and natural language models like BLIP has led to improved performance on this task, we lack understanding of the ability of such models to perform on different kinds of questions and reasoning types. As our initial analysis of BLIP-family models revealed difficulty with answering fine-detail questions, we investigate the following question: Can visual cropping be employed to improve the performance of state-of-the-art visual question answering models on fine-detail questions? Given the recent success of the BLIP-family models, we study a zero-shot and a fine-tuned BLIP model. We define three controlled subsets of the popular VQA-v2 benchmark to measure whether cropping can help model performance. Besides human cropping, we devise two automatic cropping strategies based on multi-modal embedding by CLIP and BLIP visual QA model gradients. Our experiments demonstrate that the performance of BLIP model variants can be significantly improved through human cropping, and automatic cropping methods can produce comparable benefits. A deeper dive into our findings indicates that the performance enhancement is more pronounced in zero-shot models than in fine-tuned models and more salient with smaller bounding boxes than larger ones. We perform case studies to connect quantitative differences with qualitative observations across question types and datasets. Finally, we see that the cropping enhancement is robust, as we gain an improvement of 4.59% (absolute) in the general VQA-random task by simply inputting a concatenation of the original and gradient-based cropped images. We make our code available to facilitate further innovation on visual cropping methods for question answering. Comment: 16 pages, 5 figures, 7 tables
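
The idea of feeding a model both the original image and a cropped region can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the two images are concatenated, so the side-by-side layout, the nearest-neighbour upscaling of the crop, and the names `crop`, `resize_nearest`, and `concat_original_and_crop` are all assumptions made for illustration. The bounding box is taken as given here; in the paper it would come from human annotation or from the CLIP- or gradient-based strategies.

```python
from typing import List, Tuple

# Illustrative stand-in: a grayscale image as rows of pixel values.
Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Return the part of the image inside the bounding box, clamped to the image."""
    left, top, right, bottom = box
    height, width = len(image), len(image[0])
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    if left >= right or top >= bottom:
        raise ValueError("bounding box does not overlap the image")
    return [row[left:right] for row in image[top:bottom]]


def resize_nearest(image: Image, height: int, width: int) -> Image:
    """Resize with nearest-neighbour sampling (assumed; any resampling would do)."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def concat_original_and_crop(image: Image, box: Box) -> Image:
    """Place the original and the upscaled crop side by side (assumed layout)."""
    height, width = len(image), len(image[0])
    zoomed = resize_nearest(crop(image, box), height, width)
    return [orig_row + zoom_row for orig_row, zoom_row in zip(image, zoomed)]


if __name__ == "__main__":
    toy = [[r * 4 + c for c in range(4)] for r in range(4)]
    for row in concat_original_and_crop(toy, (1, 1, 3, 3)):
        print(row)
```

The combined image would then be passed to the question answering model in place of the original; a smaller box gives a stronger zoom on the fine detail the question asks about.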

    BRAINTEASER: Lateral Thinking Puzzles for Large Language Models

    The success of language models has inspired the NLP community to attend to tasks that require implicit and complex reasoning, relying on human-like commonsense mechanisms. While such vertical thinking tasks have been relatively popular, lateral thinking puzzles have received little attention. To bridge this gap, we devise BRAINTEASER: a multiple-choice Question Answering task designed to test the model's ability to exhibit lateral thinking and defy default commonsense associations. We design a three-step procedure for creating the first lateral thinking benchmark, consisting of data collection, distractor generation, and generation of adversarial examples, leading to 1,100 puzzles with high-quality annotations. To assess the consistency of lateral reasoning by models, we enrich BRAINTEASER based on a semantic and contextual reconstruction of its questions. Our experiments with state-of-the-art instruction- and commonsense language models reveal a significant gap between human and model performance, which is further widened when consistency across adversarial formats is considered. We make all of our code and data available to stimulate work on developing and evaluating lateral thinking models.
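
One way to see why a consistency requirement can widen the gap is to compare per-question accuracy with a stricter score that credits a puzzle only when every variant of it is answered correctly. The sketch below assumes that definition of consistency; the abstract does not spell out the metric, and the function name `instance_and_group_accuracy` and the puzzle identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def instance_and_group_accuracy(
    results: Iterable[Tuple[str, bool]],
) -> Tuple[float, float]:
    """Compute two scores from (puzzle_id, is_correct) pairs.

    Each puzzle may appear several times, once per variant (for example the
    original question plus its reconstructions). Instance accuracy averages
    over all variants; group accuracy counts a puzzle only if all of its
    variants were answered correctly (assumed definition of consistency).
    """
    by_puzzle: Dict[str, List[bool]] = defaultdict(list)
    for puzzle_id, is_correct in results:
        by_puzzle[puzzle_id].append(is_correct)
    if not by_puzzle:
        raise ValueError("no results given")

    total_variants = sum(len(flags) for flags in by_puzzle.values())
    instance_acc = sum(sum(flags) for flags in by_puzzle.values()) / total_variants
    group_acc = sum(all(flags) for flags in by_puzzle.values()) / len(by_puzzle)
    return instance_acc, group_acc


if __name__ == "__main__":
    toy_results = [
        ("p1", True), ("p1", True), ("p1", False),  # one variant missed
        ("p2", True), ("p2", True), ("p2", True),   # fully consistent
    ]
    print(instance_and_group_accuracy(toy_results))
```

In the toy data the model answers five of six variants correctly, yet only one of the two puzzles is solved consistently, so the stricter score is the lower of the two.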

    A Study of Slang Representation Methods

    Warning: this paper contains content that may be offensive or upsetting. Considering the large amount of content created online by the minute, slang-aware automatic tools are critically needed to promote social good, and assist policymakers and moderators in restricting the spread of offensive language, abuse, and hate speech. Despite the success of large language models and the spontaneous emergence of slang dictionaries, it is unclear how far their combination goes in terms of slang understanding for downstream social good tasks. In this paper, we provide a framework to study different combinations of representation learning models and knowledge resources for a variety of downstream tasks that rely on slang understanding. Our experiments show the superiority of models that have been pre-trained on social media data, while the impact of dictionaries is positive only for static word embeddings. Our error analysis identifies core challenges for slang representation learning, including out-of-vocabulary words, polysemy, variance, and annotation disagreements, which can be traced to characteristics of slang as a quickly evolving and highly subjective language.